Disease named entity recognition by combining conditional random fields and bidirectional recurrent neural networks

Abstract

The recognition of disease and chemical named entities in scientific articles is an important subtask of information extraction in the biomedical domain. Because disease names are diverse and complex, recognizing disease named entities is considerably harder than recognizing chemical named entities. Although some notable chemical named entity recognition systems, such as ChemSpot and tmChem, are available online, publicly available systems for recognizing disease named entities are rare. This article presents a system for disease named entity recognition (DNER) and normalization. First, two separate DNER models are developed. One is based on a conditional random fields (CRF) model with a rule-based post-processing module. The other is based on bidirectional recurrent neural networks. The named entities recognized by each DNER model are then fed into a support vector machine (SVM) classifier that combines the results. Finally, each recognized disease named entity is normalized to a Medical Subject Headings (MeSH) disease name using a method based on the vector space model. Experimental results show that, trained on 1000 PubMed abstracts, the proposed system achieves F1-measures of 0.8428 at the mention level and 0.7804 at the concept level on the test data of the chemical-disease relation task in BioCreative V.
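
The normalization step can be illustrated with a minimal sketch of vector-space matching. The abstract does not specify the weighting scheme or the lexicon, so everything here is an assumption made for illustration: whitespace tokenization, smoothed TF-IDF weights, cosine similarity, a three-entry toy lexicon with placeholder concept IDs (not real MeSH identifiers), and the hypothetical function names `build_idf`, `vectorize`, `cosine`, and `normalize`. It is not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: concept ID -> disease name.
# The IDs are placeholders, not real MeSH identifiers.
LEXICON = {
    "C1": "diabetes mellitus",
    "C2": "hypertension",
    "C3": "renal insufficiency",
}


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def build_idf(names):
    """Compute a smoothed inverse document frequency over the lexicon names."""
    names = list(names)
    doc_freq = Counter()
    for name in names:
        doc_freq.update(set(tokenize(name)))
    n = len(names)
    return {tok: math.log((1 + n) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def vectorize(text, idf):
    """Map text to a sparse TF-IDF vector; tokens outside the lexicon are ignored."""
    counts = Counter(tokenize(text))
    return {tok: cnt * idf[tok] for tok, cnt in counts.items() if tok in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def normalize(mention, lexicon=LEXICON):
    """Return (concept_id, score) for the most similar name, or (None, 0.0)."""
    idf = build_idf(lexicon.values())
    query = vectorize(mention, idf)
    scored = [(cosine(query, vectorize(name, idf)), cid) for cid, name in lexicon.items()]
    best_score, best_id = max(scored)
    return (best_id, best_score) if best_score > 0.0 else (None, 0.0)
```

Calling `normalize("Diabetes")` selects the placeholder concept `C1`, while a mention that shares no token with any lexicon entry returns `(None, 0.0)`. A realistic system would likely need synonym lists and finer-grained features than whole tokens, which this sketch omits.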